Principled Tools for Modeling and Visualizing 2D Vector Fields
ggvfields can be understood by analogy with core ggplot2 concepts
Real-valued functions of real arguments \(f: \mathbb{R}\to \mathbb{R}\) are familiar from calculus
Their graphs are fundamental to their understanding (eg. \(f(x) = \sin(x)\))
ggvfields can be understood by analogy with core ggplot2 concepts
Vector fields are vector-valued functions of vector arguments \(f: \mathbb{R}^{n} \to \mathbb{R}^{m}\)
In this talk we’ll assume \(n = m = 2\), a common case in applications
Thought of in various ways
\(\displaystyle{\textbf{f}(x,y) = \textbf{f}\left(\left[\begin{matrix}x \\ y \end{matrix}\right]\right) = \left[\begin{matrix}f_{1}(x,y) \\ f_{2}(x,y) \end{matrix}\right]}\) “component functions”
\[\textbf{x} \stackrel{\textbf{f}}{\longmapsto} \textbf{y} \in \mathbb{R}^{2}\]
Examples:
\(\displaystyle{\textbf{f}(x,y) = \textbf{f}\left(\left[\begin{matrix}x \\ y \end{matrix}\right]\right) = \left[\begin{matrix}-y \\ x \end{matrix}\right]}\)
\(\displaystyle{\nabla \ell(\mu,\sigma^{2}) = \left[\begin{matrix}\frac{\partial}{\partial\mu} \ell(\mu,\sigma^{2}) \\ \frac{\partial}{\partial\sigma^{2}} \ell(\mu,\sigma^{2}) \end{matrix}\right]}\)
f <- function(u) {
x <- u[1]; y <- u[2]
c(-y, x)
}
N <- 11
df <- expand.grid( "x" = seq(-1, 1, len = N), "y" = seq(-1, 1, len = N) )
df$fy <- df$fx <- NA
for (i in 1:nrow(df)) df[i,3:4] <- f( as.numeric(df[i,1:2]) )
df <- transform(df, "radius" = sqrt(fx^2 + fy^2), "angle" = atan2(fy, fx))
head(df)
# x y fx fy radius angle
# 1 -1.0 -1 1 -1.0 1.414214 -0.7853982
# 2 -0.8 -1 1 -0.8 1.280625 -0.6747409
# 3 -0.6 -1 1 -0.6 1.166190 -0.5404195
# 4 -0.4 -1 1 -0.4 1.077033 -0.3805064
# 5 -0.2 -1 1 -0.2 1.019804 -0.1973956
# 6 0.0 -1 1 0.0 1.000000 0.0000000ggfields
Mathematica 11.3
Mathematica 13.1
Challenges with results
Unattractive results Could be more visually appealing
Not informative Lacks clarity or detail
Challenges with code
Verbose syntax Very clunky and wordy
High user burden Lots of effort and mental load
Difficult to customize Hard to see now; examples to come
Not extensible Faceting? Superimposing different colors? etc.
Ridged Accept data one way
The results should be…
Correct Does what it claims to do
Clear Easily understandable
Attractive Visually appealing
The code should be…
Simple Straightforward and concise
Familiar Easy for a ggplot user
Flexible Accept functions or data of any form
Install from CRAN
Install from Github
Load the package
How should we think about this:
x, y, fx, fyx, y, xend, yendx, y, angle, directionx, y, angle_deg, directionPoints to Ponder
Having said that…
ggvfields is a stream visualization package.
Consider the vector field
\[ \textbf{f}(x,y) = (-y,\,x) \]
which yields the ODE system
\[ \frac{dx}{dt} = -y,\quad \frac{dy}{dt} = x \]
Euler’s method estimates the ODE solution by
\[ \mathbf{v}_{n+1} = \mathbf{v}_n + \Delta t\,\textbf{f}(\mathbf{v}_n) \]
Starting at \(\mathbf{v}_0 = (1,1)\) and using \(\Delta t = 0.1\):
Iteration 1:
\(\textbf{f}(1,1) = (-1,\,1)\), so
\(\textbf{v}_1 = (1,1) + 0.1\,(-1,1) = (0.9,\,1.1)\).
Iteration 2:
\(\textbf{f}(0.9,1.1) = (-1.1,\,0.9)\), so
\(\textbf{v}_2 = (0.9,1.1) + 0.1\,(-1.1,0.9) = (0.9-0.11,\,1.1+0.09) = (0.79,\,1.19)\).
# Integrate the system using ode
(stream <- ode(init, times, f_ode, parms = NULL, method = "euler"))
# time x y
# 1 0.0 1.00000000 1.000000
# 2 0.1 0.90000000 1.100000
# 3 0.2 0.79000000 1.190000
# 4 0.3 0.67100000 1.269000
# 5 0.4 0.54410000 1.336100
# 6 0.5 0.41049000 1.390510
# 7 0.6 0.27143900 1.431559
# 8 0.7 0.12828310 1.458703
# 9 0.8 -0.01758719 1.471531
# 10 0.9 -0.16474031 1.469772
# 11 1.0 -0.31171756 1.453298Reminder: A vector is a special case of a stream
init <- c(x = 1, y = 1); times <- seq(0, 1, by = 1)
vector <-
data.frame(ode(init, times, f_ode, parms = NULL, method = "euler")) |>
pivot_wider(names_from = time, values_from = c(x, y)) |>
mutate(fx = x_1 - x_0, fy = y_1 - y_0, angle = atan2(fy, fx), angle_deg = angle * 180 / pi,distance = sqrt(fx^2 + fy^2))What if we have a lot of streams?
generate_path <- function(u, step) {
init <- unname(c(x = u[1], y = u[2])); times <- seq(0, 1, by = step)
df <- data.frame(ode(init, times, f_ode, parms = NULL, method = "euler"))
colnames(df) <- c("time", "x", "y")
df
}
streams <-
expand.grid(x = seq(-1,1,.5), y = seq(-1,1,.5)) |>
apply(1, generate_path, step = .1) |> bind_rows(.id = "id")
head(streams)
# id time x y
# 1 1 0.0 -1.00000 -1.00000
# 2 1 0.1 -0.90000 -1.10000
# 3 1 0.2 -0.79000 -1.19000
# 4 1 0.3 -0.67100 -1.26900
# 5 1 0.4 -0.54410 -1.33610
# 6 1 0.5 -0.41049 -1.39051What if we have a lot of streams?
What if we have a lot of vectors? Vectors are special cases of streams.
generate_path <- function(u, step) {
init <- unname(c(x = u[1], y = u[2])); times <- seq(0, 1, by = step)
df <- data.frame(ode(init, times, f_ode, parms = NULL, method = "euler"))
colnames(df) <- c("time", "x", "y")
df
}
vectors <-
expand.grid(x = seq(-1,1,.5), y = seq(-1,1,.5)) |>
apply(1, generate_path, step = 1) |> bind_rows(.id = "id")
head(vectors)
# id time x y
# 1 1 0 -1.0 -1.0
# 2 1 1 0.0 -2.0
# 3 2 0 -0.5 -1.0
# 4 2 1 0.5 -1.5
# 5 3 0 0.0 -1.0
# 6 3 1 1.0 -1.0What if we had a lot of vectors?
Reminder: A vector is a special case of a stream
Reminder: A vector is a special case of a stream
center = TRUERecall the ODE system: \[ \textbf{f}(x,y) = (-y,\,x) \quad \text{(i.e., } \frac{dx}{dt}=-y,\; \frac{dy}{dt}=x\text{)} \]
Euler’s method estimates the solution: \[ \mathbf{v}_{n+1} = \mathbf{v}_n + \Delta t\,\mathbf{f}(\mathbf{v}_n) \]
center = FALSE:
\(\Delta t\): from \(0\) to \(T\)
center = TRUE:
\(+ \Delta t\): from \(0\) to \(T/2\)
\(- \Delta t\): from \(0\) to \(-T/2\)
center = TRUEcenter_backward <- ode(init, times = seq(0, -0.5, by = -0.1), f_ode, parms = NULL, method = "euler")
center_backward <- center_backward[nrow(center_backward):1, ]
center_forward <- ode(init, times = seq(0.1, 0.5, by = 0.1), f_ode, parms = NULL, method = "euler")
(center <- rbind(center_backward, center_forward))
# time x y
# [1,] -0.5 1.39051 0.41049
# [2,] -0.4 1.33610 0.54410
# [3,] -0.3 1.26900 0.67100
# [4,] -0.2 1.19000 0.79000
# [5,] -0.1 1.10000 0.90000
# [6,] 0.0 1.00000 1.00000
# [7,] 0.1 1.00000 1.00000
# [8,] 0.2 0.90000 1.10000
# [9,] 0.3 0.79000 1.19000
# [10,] 0.4 0.67100 1.26900
# [11,] 0.5 0.54410 1.33610center = TRUEcenter = TRUELnormalize = FALSEFor normalize = TRUE:
- Default: Streams grow to length L provided by user or determined allegorically.
For normalize = FALSE:
- Default: Streams grow to length L, then all are trimmed to the fastest time T.
- Customization: Set T to control stream duration.
center = FALSEWithout centering, an arrow is drawn from \(f(x,y)\) to \(f(x,y) + f(x,y)\)
When centered, an arrow is drawn from \(f(x,y) - \frac{1}{2} f(x,y)\) to \(f(x,y) + \frac{1}{2} f(x,y).\)
LWhen normalize = TRUE: Vectors are scaled to a fixed length \(L\) (either user-specified or computed automatically).
normalize = FALSEWhen normalize = TRUE: Vectors are scaled to a fixed length \(L\) (either user-specified or computed automatically).
When normalize = FALSE: Vectors retain their natural magnitude—that is, they start at \(f(x,y)\) and extend to \(f(x,y) + f(x,y)\) (with \(\Delta T = 1\)).
Consider the scalar function
\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]
Its gradient is a vector field is given by
\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]
f <- function(u) {
x <- u[1]; y <- u[2]
(1/2) * (x^2 + y^2)
}
p1 <- ggplot() + ggfunction::geom_function_2d_1d(fun = f) ## sneak peak of chapter 3
p2 <- ggplot() + geom_gradient_field(fun = f)
p3 <- ggplot() + ggfunction::geom_function_2d_1d(fun = f) + geom_gradient_field(fun = f)
p1 + p2 + p3 + coord_equal() + plot_layout(guides = "collect")Consider the scalar function
\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]
Its gradient is a vector field is given by
\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]
Express the potential function as a line integral:
\[ \phi(x,y)=\int_{(x_0,y_0)}^{(x,y)} \nabla \phi \cdot (dx,dy). \]
Integrate the x-component: \[ \phi(x,y)=\int x\,dx = \frac{1}{2}x^2 + g(y) \]
Differentiate with respect to y: \[ \frac{\partial}{\partial y}\phi(x,y)=\frac{\partial\phi}{\partial y}\left(\frac{1}{2}x^2+g(y)\right) = \frac{\partial\phi}{\partial y}g(y) = \frac{\partial \phi}{\partial y}=g'(y) = y \]
Solve for g(y) by integrating: \[ g(y) = \int g'(y)\,dy = \int y\,dy = \frac{1}{2}y^2 + C \]
So: \[ \phi(x,y) = \frac{1}{2}x^2+g(y) = \frac{1}{2}x^2+\frac{1}{2}y^2+C \]
Consider the scalar function
\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]
Its gradient is a vector field is given by
\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]
ggplot(rand_dat |> print(n = 6)) +
geom_vector(aes(x = x, y = y, fx = fx, fy = fy), normalize = FALSE)
# # A tibble: 100 × 4
# x y fx fy
# <dbl> <dbl> <dbl> <dbl>
# 1 -98.4 24.9 6.71 -3.85
# 2 -91.6 30.1 -4.42 37.7
# 3 -91.1 30.1 0.00252 34.3
# 4 -96.7 31.6 21.2 -20.4
# 5 -98.5 24.8 7.02 -3.46
# 6 -99.8 33.2 0 0
# # ℹ 94 more rows\(\mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{E},\)
where
The least squares estimator is
\(\widehat{\mathbf{B}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{Y}.\)
\[ \boldsymbol{\Sigma}_{\widehat{\mathbf{v}}} = \mathbf{X}_{\text{new}}\,\mathbf{V}\,\mathbf{X}_{\text{new}}^\top + \boldsymbol{\Sigma}, \quad \mathbf{V} = \sigma^2 (\mathbf{X}^\top \mathbf{X})^{-1}. \] Eigen-decomposition:
\[ \boldsymbol{\Sigma}_{\widehat{\mathbf{v}}} = \mathbf{Q}\,\boldsymbol{\Lambda}\,\mathbf{Q}^\top, \quad \boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1,\lambda_2). \]
Semi-axis lengths and orientation:
\[ a_i = \sqrt{\lambda_i\,\chi^2_{1-\alpha}}, \quad i=1,2, \qquad \theta = \arctan\left(\frac{q_{2,1}}{q_{1,1}}\right). \]
\[ r = \sqrt{f_x^2 + f_y^2}, \quad \theta = \operatorname{atan2}(f_y, f_x), \] with Jacobian determinant \(|r|\). The joint density becomes: \[ f_{r,\theta}(r,\theta) = f_{f_x,f_y}(r\cos\theta,\,r\sin\theta)\,|r|. \] Marginalize over r to get angular density:
\[ f_\theta(\theta) = \int_0^\infty f_{r,\theta}(r,\theta)\,dr, \qquad F_\theta(\theta) = \int_{-\pi}^\theta f_\theta(u)\,du. \] Confidence bounds for angular uncertainty:
\[ F_\theta(\theta_{\text{lower}}) = \frac{\alpha}{2}, \quad F_\theta(\theta_{\text{upper}}) = 1 - \frac{\alpha}{2}. \]
Cokriging is a technique that extends ordinary kriging to jointly predict multiple, correlated spatial variables. It leverages the spatial cross-correlation between the variables to improve prediction accuracy.
GAMs extend linear models by allowing non-linear relationships between predictors and the response via smooth functions. In spatial statistics, a GAM typically models the response as:
\[ y = \beta_0 + f(x,y) + \varepsilon, \]
where \(f(x,y)\) is a smooth function estimated from the data.
ggplot(rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
geom_vector_smooth(eval_points = data.frame(x = -90.8, y = 26.25), method = "gam") +
geom_vector_smooth(eval_points = data.frame(x = -90.8, y = 26.25), method = "lm") +
geom_vector_smooth(eval_points = data.frame(x = -90.8, y = 26.25), method = "kriging") +
geom_vector(alpha = .3, color = "black")
# Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]What if we saw this regression equation as a vector field? After all, it takes in x/y data and outputs fx/fy data?
vec_field <- function(u) {
model <- lm(cbind(fx,fy)~x*y, data = rand_dat)
newdata <- data.frame(x = u[1], y = u[2])
as.numeric(predict(model, newdata = newdata))
}
ggplot() +
geom_vector(data = rand_dat, aes(x = x, y = y, fx = fx, fy = fy), alpha = .2) +
geom_vector_field(data = rand_dat, fun = vec_field, mapping = aes(x = x, y = y, fx = fx, fy = fy))scalar_field <- function(u) {
mod <- lm(z ~ x + y + poly(x,2) + poly(y,2), data = sample_data)
pred <- predict(mod, newdata = data.frame(x = u[1], y = u[2]))
as.numeric(pred)
}
# plot the scalar data with the estimated gradient
ggplot(grid) +
geom_raster(aes(x = x, y = y, fill = z)) +
scale_fill_gradient(low = "white", high = "black") +
geom_gradient_field(fun = scalar_field) +
scale_color_viridis_c()# Reading layer `Elev_Contour' from data source
# `C:\Users\Dusty_Turner1\OneDrive - Baylor University\r_projects\dissertation-turner\geom_vector_smooth\ELEV_Waco_W_TX_1X1_Shape\Shape\Elev_Contour.shp'
# using driver `ESRI Shapefile'
# Simple feature collection with 18500 features and 14 fields
# Geometry type: MULTILINESTRING
# Dimension: XY
# Bounding box: xmin: -98 ymin: 31 xmax: -97 ymax: 32
# Geodetic CRS: NAD83
ggplot2 Layers Expect Dataggplot2 Layers Expect Datageom_point()
geom_line()
geom_segment()
geom_smooth()
geom_stream()
ggplot2 Layers Expect Functionsgeom_function()
geom_stream_field()
Install from Github
Load the package
Probability Density Function (PDF):
A PDF \(f_X(x)\) satisfies
\(\Pr(a \le X \le b) = \int_a^b f_X(x) \, dx, \quad \int_{-\infty}^{\infty} f_X(x) \, dx = 1\) For example, the standard normal PDF is:
\(f(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)\)
Example: Using ggfunction, you can plot the normal PDF with automated shading of a quantile region:
LTC Dusty Turner